AITopics | code slm

Collaborating Authors

code slm

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Revisiting Pre-trained Language Models for Vulnerability Detection

Li, Youpeng, Qi, Weiliang, Wang, Xuyu, Yu, Fuxun, Wang, Xinda

arXiv.org Artificial IntelligenceNov-25-2025

The rapid advancement of pre-trained language models (PLMs) has demonstrated promising results for various code-related tasks. However, their effectiveness in detecting real-world vulnerabilities remains a critical challenge. While existing empirical studies evaluate PLMs for vulnerability detection (VD), they suffer from data leakage, limited scope, and superficial analysis, hindering the accuracy and comprehensiveness of evaluations. This paper begins by revisiting the common issues in existing research on PLMs for VD through the evaluation pipeline. It then proceeds with an accurate and extensive evaluation of 18 PLMs on high-quality datasets that feature accurate labeling, diverse vulnerability types, and various projects. Specifically, we compare the performance of PLMs under both fine-tuning and prompt engineering, assess their effectiveness and generalizability across various training and testing settings, and analyze their robustness to a series of perturbations. Our findings reveal that PLMs incorporating pre-training tasks designed to capture the syntactic and semantic patterns of code outperform both general-purpose PLMs and those solely pre-trained or fine-tuned on large code corpora. However, these models face notable challenges in real-world scenarios, such as difficulties in detecting vulnerabilities with complex dependencies, handling perturbations introduced by code normalization and abstraction, and identifying semantic-preserving vulnerable code transformations. Also, the truncation caused by the limited context windows of PLMs can lead to a non-negligible number of labeling errors, which is overlooked by previous work. This study underscores the importance of thorough evaluations of model performance in practical scenarios and outlines future directions to help enhance the effectiveness of PLMs for realistic VD applications.

large language model, machine learning, vulnerability detection, (21 more...)

arXiv.org Artificial Intelligence

2507.16887

Country:

Europe (1.00)
Asia (1.00)
North America > United States > California (0.92)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Retrieval-Augmented Instruction Tuning for Automated Process Engineering Calculations : A Tool-Chaining Problem-Solving Framework with Attributable Reflection

Sakhinana, Sagar Srinivas, Sannidhi, Geethan, Runkana, Venkataramana

arXiv.org Artificial IntelligenceAug-28-2024

The current technology landscape lacks a foundational AI model for solving process engineering calculations. In this work, we introduce a novel autonomous agent framework leveraging Retrieval-Augmented Instruction-Tuning (RAIT) to enhance open, customizable small code language models (SLMs) for these calculations. By combining instruction tuned code SLMs with Retrieval-Augmented Code Generation (RACG) using external tools, the agent generates, debugs, and optimizes code from natural language specifications. Our approach addresses the limitations of the current lack of a foundational AI model for specialized process engineering tasks and offers benefits of explainability, knowledge editing, and cost-effectiveness. Additionally, we curate custom datasets of chemical and process engineering problems and solutions to overcome data scarcity. Experimental results show that our framework matches the performance of large-scale proprietary models on benchmark datasets, proving its effectiveness and usability.

calculation, code slm, language model, (12 more...)

arXiv.org Artificial Intelligence

2408.15866

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.96)

Add feedback

Narrow Transformer: Starcoder-Based Java-LM For Desktop

Rathinasamy, Kamalkumar, J, Balaji A, Kumar, Ankush, Gayari, Gagan, K, Harshini, Mondal, Rajab Ali, S, Sreenivasa Raghavan K, Singh, Swayam

arXiv.org Artificial IntelligenceJul-4-2024

The state-of-the-art code models, capable of understanding and generating code in numerous programming languages, are revolutionizing the way enterprises approach software development. With the ability to understand and generate code across a vast array of programming languages, these code models offer a significant boost in productivity. However, the one-size-fits-all approach of these generic multi-lingual code models often falls short in meeting the nuanced requirements of project-level coding tasks in an enterprise, which tend to be language-specific. This has led to the development of Narrow Transformers (NTs), specialized models further trained on a particular programming language, offering a more efficient solution for enterprises. These NTs are designed to optimize performance for a specific programming language, balancing the trade-offs between model size, inferencing cost, and operational throughput. As demand for tailored solutions grows, we can expect a surge in NT development, providing the precision and efficiency required by enterprise projects. However, in practice, the substantial economic cost associated with training and fine-tuning large code models renders language model experiments prohibitively expensive for most researchers and organizations.

dataset, nt-java-1, programming language, (14 more...)

arXiv.org Artificial Intelligence

2407.03941

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Africa > Rwanda > Kigali > Kigali (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology (0.52)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback